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Here we present a program aimed at free-energy calculations in molecular systems. It consists 
of a series of routines that can be interfaced with the most popular classical molecular dynamics 
(MD) codes through a simple patching procedure. This leaves the possibility for the user to exploit 
many different MD engines depending on the system simulated and on the computational resources 
available. Free-energy calculations can be performed as a function of many collective variables, with 
a particular focus on biological problems, and using state-of-the-art methods such as metadynamics, 
umbrella sampling and Jarzynski-equation based steered MD. The present software, written in ANSI- 
C language, can be easily interfaced with both fortran and C/C++ codes. 
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I. INTRODUCTION 



Free-energy calculations have a fundamental role in the understanding of many natural phenomena ranging from 
protein folding up to polymorphic transitions in solids. To this end, molecular dynamics (MD) has been extensively 
used together with a series of algorithms aimed at extending its capabilities far beyond those allowed by straightforward 
MD. Among some of the most popular enhanced sampling techniques are umbrella sampling [3, HI Q , Jarzynski equation 
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based methods [U, adaptive force bias [f| and metadynamics @. 

At the same time, a multitude of MD codes have been developed through the years with focus on different fields 
of application. Some of these programs are more solid-state oriented, with a particular attention to the variety of 
implemented potentials (DL_POLY 0). Others are more specialized in biomolecular systems, with specific potentials 
developed to that scope (CHARMMjp, GROMACS [| and NAMD pi), or implicit solvent capabilities (CHARMM, 
AMBER [n| and GROMOS [H, fl3j]). Recently, a lot of effort has also been devoted to efficient parallelization, 
allowing a linear scaling reduction of computational time with the number of processors. In this respect, NAMD, 
DESMOND [1J|, GROMACS and pmemd in AMBER are currently among the best performing programs. 
This wealth of codes provides the user a wide range of capabilities, but only few of them offer interfaces for specific free- 
energy methods, often with a limited set of collective variables. In particular, metadynamics has been implemented 
separately for some of these programs, and so far the only one freely distributed is GROMETA [l5| for GROMACS. 
This prompted us to develop a plugin compatible with many of the aforementioned codes so as to facilitate free-energy 
calculations with a unified input. 



II. THEORETICAL BACKGROUND 



A. Free-energy methods 

We consider a system made of N atoms characterized by microscopic coordinates r £ K 3Ar and a potential energy 
function U(r). We then introduce a set of d collective variables (CVs), s(r), where s £ M. d . These variables are used 
as order parameters, i.e. to distinguish between macroscopically different configurations. The Helmholtz free energy 
as a function of these CVs is defined as: 

F(s) = -~ In J dr e- 0U ^S(s - s(r)) + C, (1) 

where C is an immaterial constant. The free energy F(s) contains crucial informations about the thermodynamics 
of the system, and allows the calculation of the ensemble average of any observable that depends on the CVs s. 
Moreover, when the CVs are properly chosen, the free-energy profile can be used to model the out-of-equilibrium 
behavior of the system by means of a stochastic dynamics [l|| E3, EH, Ell • 

The simplest way to obtain the Helmholtz free energy from an unbiased MD simulation is to evaluate the histogram 
of the visited configurations in the CVs space N(s), so that the estimated free energy F(s) reads: 

F(s) = ~lnN(s). (2) 

However, in presence of rare events this method is completely impractical, since it would require an enormous com- 
putational time. Many methods have been proposed to tackle the rare-event problem and to calculate free-energy 
profiles. Some of them are aimed at enhancing the sampling of the canonical ensemble, so that the free energy is 
still estimated by Eq. ([2]). This class includes methods such as simulated tempering [2(1, parallel tempering 2l|, 



Hamiltonian replica exchange [23| and solute tempering [24|. Here we concentrate on a second class of methods, 
which are based on collecting configurations in a biased ensemble and require one to select the set of CVs prior to 
the simulation. The prototype of all these methods is umbrella sampling where the simulation is performed with 
a fixed additional bias potential V(s), and the unbiased free energy is recovered as 

F(s) = —In N(s)-V(s). (3) 

While Equation is valid for any choice of the bias potential V(s), the efficiency of the sampling and convergence 
properties are strongly dependent on V{s). In particular, if an approximate free-energy estimate F'(s) is available 
before the simulation, the choice V(s) = —F'(s) would give an approximately flat histogram, thus helping in over- 
coming the free-energy barriers. However, it is rather difficult to have a reliable free-energy estimate before the 
simulation. Various improvements have been introduced so as to refine the bias potential on the fly (see, among 
others, Refs. [EL 15 1251 1291 1. We focus here on metadynamics, in its standard form [H, HtJ or in the recently introduced 
well-tempered flavor [281 ] - In particular, we consider the direct version of metadynamics, where the bias is acting 
directly on the microscopic coordinates. For an excellent review of several variants of metadynamics see Ref. [29j | . 

In standard metadynamics the bias potential is built during the simulation as a sum of Gaussian functions centered 
on the previously visited configurations in the CVs space. This manner of biasing the evolution by discouraging the 
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visited configurations was first introduced in the taboo search [30] and, in the context of MD, by the local elevation 
method [3l|. The approach is also closely related to the Wang and Landau algorithm [25[, adaptive force bias [l| 
and self-healing umbrella sampling 26]. In metadynamics, the bias at time t is written as an integral on the past 
trajectory r(t): 



V(s,t)= [ cft'wexp(-V 
J o \ i=l 



( Si (r) - sMt')f 
2a? 



(4) 



Here cr^ is the Gaussian width corresponding to the i-th CV and represents the resolution for that CV, and to is the 
rate at which the bias grows. As it was shown empirically [2?| and analytically [32| for model Langevin dynamics, 
in the long run the bias will converge to the negative of the free energy and then oscillate around that profile. As 
a consequence, the final histogram will be approximately flat in the CVs space, allowing for an uniform exploration 
in spite of the free-energy barriers. We also observe that the bias performs a work on the system, which needs to 
be dissipated. Usually during a metadynamics simulation the thermostat keeps the system in thermal equilibrium, 
unless the growth rate of the bias is too large. 

Metadynamics has been historically used in two different and complementary manners. It has been used to escape 
free-energy minima (see e.g. Ref. 33]). i.e. to find a reasonable saddle point out of a local minimum. In this 
case, metadynamics should be stopped as soon as the system exits from the minimum and starts exploring a new 
region of space. In other applications, it has been used to exhaustively explore the CV space and reconstruct the 
free energy. In the examples presented in this paper, we focus more on this latter application. Its main advantage 
over umbrella-sampling technique is that it inherently explores the region of low free energy first. Here the simulation 
should be stopped when the motion of the CVs becomes diffusive in this region of interest, and the bias itself can be 
used as an estimate of the underlying free energy: 

F(s,t) = -V(s,t) (5) 

(note that if s does not include all relevant order parameters the bias may not converge in a reasonable simulation 
time). The free-energy estimate at time t, F(s,t), is indeed an unbiased estimator of the exact free energy F(s) [3^ ]. 
However, F(s, t) fluctuates around F(s) with an amplitude which depends on both the diffusion coefficient in the CV 
space and on the metadynamics parameters u) and a, and a more accurate calculation can be performed decreasing u). 
Clearly, a smaller uj means that more time is required to reconstruct the free-energy landscape, therefore a compromise 
needs to be found between speed and accuracy. One can also exploit the fact that F(s,t) is an unbiased estimate of 
F(s) at all times, and take the time average of all the profiles as done, for instance, in Ref. [34]. However, as the 
simulation continues, configurations of higher and higher free energy are explored and, in order to take the average it 
is necessary to force the system to remain inside the region by a suitable restraining potential. 

An alternative approach is the recently introduced well-tempered metadynamics [281 ] . Well-tempered metadynamics 
is a variant of the method that solves the problem of the fluctuations in a different way, and is more suitable for 
performing free-energy calculations in several dimensions since it allows avoiding the complication of restraining the 
dynamics inside a region. In the well-tempered algorithm, the rate at which the bias is grown is decreased during the 
simulation proportional to e - ^ 8 '*" AT , where AT is a characteristic energy: 



V(s,t) = J* dt'u>e- v ( a WW AT exp 



( Sl (r) - Sl (r(t')) 2 
2a? 



(0) 



Over long time, it can be shown that the bias converges to a fraction of the exact free energy: 



]imV(s,t) = --^—F(s). (7) 
Conversely, it is possible to estimate the free energy as: 

AT + T 

F(s,t) = ^V(s,t). (8) 

This estimator does not suffer of the fluctuation problem of standard metadynamics. Moreover at variance with 
standard metadynamics, the exploration for large times will not be uniform in the CV space but instead it will satisfy 
the probability distribution: 

P(s) oc e -n«)/(T+AT)_ (9) 
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Thus, the CVs will be sampled at a finite but arbitrarily high temperature T + AT. This is a rather important feature 
of well-tempered metadynamics, especially for d > 1, since it allows to focus the exploration on the low free-energy 
regions, such as the main minima and the saddle points. The explored free-energy range can be varied by tuning AT, 
and standard metadynamics is recovered for AT — > oo. 

In practical implementations, the bias is updated with a finite temporal stride t g , so that: 

V(s, t) = £ We- V ^')),n/AT exp f £ (si(r) -^(r(t')f \ (1Q) 

t'=0,T G ,2i- G ,... \ t=l 0i ) 

where W = tquj is the height of a single Gaussian. 



B. Metadynamics implementation 

A metadynamics implementation should perform two basic tasks: (a) keep track of the visited configurations in 
the CV space or, equivalently, of the shape of the bias potential and (b) add the proper forces to the microscopic 
dynamics. 

The first task is accomplished by maintaining a list of the Gaussians which have been added to the bias. This list 
is dynamic and grows during the simulation. The list is also stored in a file that can be used to restart a simulation, 
and to plot the bias with an external utility. 

The second task requires evaluation of the bias forces, i.e. the derivatives of the bias potential with respect to the 
microscopic coordinates r. This derivative is calculated using the chain rule: 

dV(a,t) _" dV(s,t) d Sj (r) 
dr. i ds.; Or, 

The first part of the derivative is simply a sum of analytical derivatives of Gaussian functions. This sum gets more 
and more expensive as the simulation proceeds and the number of Gaussians grows (35j . Usually this is not a problem 
since, if we exclude the simplest test cases, this effort is incomparably smaller than that of evaluating the force-field. 
For specific needs, an implementation based on the storage of the bias potential on a grid could be faster. The second 
part of the derivative in Eq. (fTTj) depends on the specific choice for the CVs. Thus, for each of the CVs that one 
wants to use, it is necessary to provide routines which, given the microscopic coordinates, return the value of the 
CV and its gradients. Writing and debugging these routines for a large number of CVs requires a noticeable effort. 
However, it is worthwhile pointing out that this effort is the main ingredient of many other free-energy methods, and 
thus our plugin has been adapted to perform other kinds of free-energy calculations, such as umbrella sampling 0, Q 
or Jarzynski equation [4| based methods, as it will be discussed in Section [III1 



C. Collective variables 



The implementation of many different CVs is required to deal with the huge variety of problems of interest and to 
rive a proper description of each. Here we describe all the possibilities present in the current package. 

• Atom position. The absolute position of an atom or a group of atoms. This CV is implemented with several 
options that allow the user to restrict the bias to a given direction, e.g. z, or to bias the position of the particle 
as projected onto a selected segment or, in analogy with the path CV, to bias the atoms distance from a segment. 
This variable is not translationally invariant. 

• Distance. The distance between two atoms or, more generally, the distance between the centers of mass of two 
groups of atoms identified as G\ and G*2 : 



Sdist — 
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;eG : 



rcuvi 



E 



6G : 



m i r l 



rrii 



EiGGi 

= \r Gl ~r G2 \, 



E, 



EG 



rrii 



(12) 
(13) 



where m; and are the mass and the position of the i — th atom respectively, and r Gl and r G2 are the centers 
of mass of the two groups. 
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• Angle. The angle denned by three atoms or, more generally, the angle defined by the centers of mass of three 
groups of atoms identified as G\, G 2 and G3: 

r c = r Gl - r G2 (14) 
n = r Gl - r Gs (15) 
r a = r G2 - r G3 (16) 

s angle = cos- 1 [ rl ^\ r J l ) ■ (17) 

• Torsion. The dihedral angle defined by four atoms or, more generally, the dihedral angle defined by the centers 
of mass of four groups of atoms identified as G\, G 2 , G 3 and G4: 

r a = r Gi - r Gs (18) 

n = r G2 - r Gs (19) 

r c = r G3 - r G . 2 (20) 

r d = r Gl - r G2 (21) 
-1 ( (r a x r b ) ■ (r c x r d ) \ 

^torsion — COS I . .. . I . l^^j 

V \r a x r b \\r c x r d \ J 

• Minimum distance. The distance between the two closest atoms pertaining to two different groups G\ and 
G 2 , approximately obtained with the following expression: 

Smindist — 7 ~ (^^) 

where & is a user-supplied smoothing parameter. 

• Coordination. The coordination number of one atom, or more atoms, with respect to another atom or group 
of atoms (e.g. the coordination of an ion with respect to all the water molecules in the simulation box): 

Scoord ^ ^ ^ ^ ^iji (24) 

where 



1 if \ri — rj\ < S 

1- ( ' r ° ^ if \n-rj\>6. (25) 



The user-supplied parameters tq, 5, n and m allow a great flexibility to fine-tune the decay of the switching 
function, e.g. a more accurate counting of the atoms in the coordination shell. In general a good guess for these 
parameters can be achieved by looking at the pair distribution function between the first and the second group 
of atoms. A good starting point is to take 5 as the position of the first peak in the pair distribution function, 
ro as the full width at half maximum of the peak and n and m to force Sjj ~ at the first minimum of the pair 
distribution function. However, depending on the system properties, different choices may give better results. 

• Hydrogen bonds. The number of intra-protein hydrogen bonds, defined as: 

Shbonds=Y\ V f(i,j) ), r ° ,/ m , (26) 

' ' 1 I \ r i- r j \ 

teGojeGn 1 - [ ro j 

where Go the group of oxygen atoms of the protein, Gr the group of hydrogen atoms of the protein. Typically, 
r = 2.5 A, n — 6 and m = 10. The function f(i,j) selects a particular type of hydrogen bonds, depending on 
the user choice: all, a-helix pattern, /3-strand pattern (even or odd). 
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• Interfacial water. This variable is intended to calculate the number of atoms of a certain group Go that are 
in contact with atoms of both groups G\ and G2 at the same time {e.g. the number of waters at the interface 
of two surfaces): 

Swaterbridge - Z^i / l, r . I \ m / |_._-., I \ " ^ > 



1 - (— ) / v*^ 1 - (^) 



More precisely, the variable counts the number of atom triples (i S Go, j G G\,k 6 G2) with i in contact with 
both j and (see also coordination parameters). 

Radius of gyration. The radius of gyration of a group G, defined as: 



^gyration — \ ^—^ ■ 

V Z^ieG m * 

Similarly, one can be interested in the trace of the inertia tensor, which can be shown to be equal to: 

gyration f^m,. (29) 

• Dipole moment. The electric dipole of a group of atoms: 



ole = (30) 



where qi is the charge of each atom i. 

• Dihedral correlation. This variable measures the degree of similarity of a list of adjacent dihedral angles. It 
is defined by: 



Sdihe 



.Ell + cos^^i)), (31) 



where (f>i is the dihedral angle defined by four atoms. For proteins, the variable grows with the content of 
secondary structure. 

• Alpha-beta similarity. This variable measures the similarity of dihedral angles with respect to reference 
values: 



1 n 

±£(l + cos (32) 



s a -p 

i=i 

where reference dihedrals 0[ e ^ are given as input. For proteins, this variable can be use to measure the amount 
of a or [3 secondary structure. 

• Torsional rmsd. Root of mean square deviation of selected dihedral angles with respect to a reference config- 
uration: 



^torsrmsd — \/ j [OO J 

V n 

where n is the number of reference dihedrals. 

• Path collective variables. Path collective variables are a general approach based on a previous (approximate) 
knowledge of the reaction path [36[ . If one assumes that the transition from A to B can be described by a set of 
CVs S(r), which are in general non-linear vectorial functions of the microscopic variables r, then it is possible 
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to define two associated variables. One aims at measuring the progress along a parametric path in CVs space 
S(7) composed of P frames: 



spP 1 -A||S(r)-S(Z)|| 2 



z (r) = -iln(^e-W)- s WII 2 ), (35) 



.-, t e 

^^^lA||S(r)-S(OI 

and the other measures the distance from the closest point along the path: 

p 

i=i 

where ||. . .|| is the metric that defines the distance between two configurations. With such definitions s (r) is 
a pure number that ranges from 1 to P and z (r) has the dimension of the chosen distance, squared. The 
parameter A is in general chosen as 2.3/(Ad) 2 where Ad is the average distance among adjacent frames. 

The definition of the S(Z) and of the measure ||. . .|| can be chosen by the user among the following: 

— Root mean square displacement in Cartesian coordinates: The path S(Z) may be defined as a series 
of configuration in Cartesian space. Each configuration is made of a subset of atoms of the system and the 
distance is calculated as root mean square displacement (RMSD) after optimal alignment [371 ]. 

— RMSD in distances: the path S(Z) must be defined as a set of pair distance among atoms. The RMSD 
is calculated as a difference of distances: 



V Ndist 

where di is the distance between the atoms of the i-th couple and Ndist is the total number of couples 
considered [38l. [39j. 

— Contact map distance: the path S(Z) is defined by a set of contact maps [40(, where each contact is 
defined as in Eq. (|26|) . The distance is defined as: 



|S(r)-S(i)|| = 



\ 



E^W-W, (37) 



where Cj(r) is the i-th contact for the configuration r and N con t is the total number of contacts considered. 

The CV z (r), when used with a single reference frame or contact map, is equivalent to the distance between a 
configuration and the reference structure, measured in the chosen metric (squared). Therefore, this CV should 
be used to reproduce the standard RMSD, distance RMSD or CMAP distance. 



III. USAGE EXAMPLES 



When a given program is instructed to use PLUMED (see the manual for specific implementations), a supplementary 
input file for the free-energy calculation must be provided. 

During the calculation the main output is a file containing a record of the values of CVs. This file is generally called 
COLVAR. It may contain also additional informations depending on the chosen free-energy method. 
In the following we illustrate the basic use of the different methods, available for all the MD codes specified in section 
IVI1 and of the algorithms currently implemented only in the GROMACS version. Many additional examples of CVs 
and sampling techniques, such as multiple walkers metadynamics [41j . are contained in the test directory distributed 
with the code. 



A. Metadynamics 



A generic input for metadynamics appears as follows: 
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# switching on met adynamics and Gaussian parameters 
HILLS HEIGHT 0.1 W_STRIDE 100 

# well-tempered metadynamics 
WELLTEMPERED SIMTEMP 300 BIASFACTOR 10 

# instruction for CVs printout 
PRINT W_STRIDE 50 

# a simple CV: the distance between atoms 

# or group of atoms (in this case between atom 13 and atoms group <gl>) 
DISTANCE LIST 13 <gl> SIGMA 0.35 

gl-> 

17 20 22 30 
gl<- 

# wall potential 

UWALL CV 1 LIMIT 15.0 KAPPA 100.0 EXP 4.0 EPS 1.0 OFF 0.0 

# end of the input 
ENDMETA 

Three kinds of keyword may exist: the directive (needed keyword to be placed in the first position along a line that 
specifies the intent of the following keywords), the parameter keyword (which specifies the attribute in the following 
field/s ) and flags which simply turn on or off a given option. 

The HILLS is a directive and switches on metadynamics. 
The line containing this keyword also sets up the parameters for Gaussians deposition: HEIGHT (parameter keyword) 
followed by the Gaussian height in the energy unit of the program chosen and W_STRIDE (parameter keyword), which 
specifies the time between the deposition of two consecutive Gaussians (in number of timesteps). This input has the 
effect of producing an additional file, called HILLS, which has the following layout: 

20.000 2.78975 0.35000 0.11111 10.000 

40.000 2.94914 0.35000 0.10926 10.000 

60.000 2.75472 0.35000 0.10737 10.000 

80.000 2.76470 0.35000 0.10542 10.000 

This is a record of the Gaussians put during the run: it displays the time step, the center of the Gaussian (one for 
each CV), the width (one for each CV) and the height. In case of well-tempered metadynamics the Gaussian height 
is rescaled using the bias factor printed in the last column in order to directly obtain the free energy (and not the 
bias), when summing all the Gaussians deposited during the run. 

The WELLTEMPERED directive switches on well-tempered metadynamics. As explained in section III Al CVs are 
sampled at a fictitious higher temperature T + AT determined by the bias factor (T + AT) /T. The user must specify 
this bias factor using the keyword BIASFACTOR and the system temperature using SIMTEMP. 

The PRINT directive allows one to monitor, during the simulation, the evolution of the CVs between the deposition 
of two Gaussians. The CVs values are printed on the C0LVAR file with a frequency, expressed in timestep units, 
controlled by the parameter keyword W_STRIDE. The file produced looks as follows: 



0. 


,000 


2, 


.26464 


0. 


.000 


0, 


,000 


10. 


.000 


2. 


,40452 


0. 


.000 


0, 


.000 


20. 


.000 


2. 


,78975 


0. 


.100 


0, 


.000 


30. 


.000 


3. 


,06159 


0. 


.074 


0, 


.000 


40. 


.000 


2. 


,94914 


0. 


.188 


0, 


.000 


50. 


,000 


2, 


,76442 


0. 


.185 


0. 


,000 



where the first column is the time step, the next contains the CV value (one for each CV), the third is the bias 
potential and the last the potential due to a wall or a restraint. 

The DISTANCE directive selects the CV (in this case the distance between the center of mass of two groups of atoms). 
LIST (parameter keywords) specifies the two atoms or group of atoms whose distance is calculated. The atom indices 
range from 1 to N at in the order they appear in a reference structure produced by the program. In case of a group of 
atoms, the name of the group must be specified between brackets <>. The list of atoms belonging to the group can 
be placed anywhere in the input file. 

The parameter keyword SIGMA specifies the Gaussian width in CV units. 

The UWALL directive switches on a wall potential on the collective variable CV. This potential starts acting on the 
system when the value of the CV is greater (or lower in the case of LWALL) then a certain limit (LIMIT) minus an 
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offset (OFF). The functional form of this potential is the following: 



( 



LIMIT + OFF 



) 



EXP 



V wa ii{s) = KAPPA 



•S 



(38) 



EPS 



where KAPPA is an energy constant in internal unit of the code, EPS a rescaling factor and EXP the exponent determining 
the power law. 

The multiple definition of CVs is allowed. The directive ENDMETA specifies the end of the input. All the following 
text will be discarded. The symbol # is a comment line which is ignored. 



A general input for umbrella sampling calculation is the following: 

# a simple CV: a dihedral angle 
TORSION LIST 13 15 17 1 

# switching on umbrella sampling and parameters 
UMBRELLA CV 1 KAPPA 200 AT -1.0 

# instruction for CVs printout 
PRINT W_STRIDE 100 

# end of the input 
ENDMETA 

The directive UMBRELLA switches on umbrella sampling on the collective variable specified by the parameter keyword 
CV (in this case the first CV that appears in the input). The position sq of the umbrella restraint is determined by 
the keyword AT, and the spring constant - whose energy units depend on the MD code used - by the keyword KAPPA. 
The functional form of the potential is the following: 



The directive TORSION selects the type of CV, in this case a dihedral angle defined by four atoms or group of atoms. 
The CVs value is printed on the C0LVAR file with a stride fixed by the keyword W_STRIDE. In case of umbrella sampling, 
this file looks as follows: 
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1 


-1 


.00000 



This file contains, from left to right: the time step, the CV value (one for each CV), the potential coming from the 
Gaussians, the harmonic potential of umbrella sampling, the CV on which the restraint acts and the position of the 
restraint. The final calculation of the free energy as a function of this CV can be done using the weighted histogram 
analysis method, choosing one of the many possible implementations (see section IVIIBI) . 



PLUMED can be used to drag a system to a target value in CV space using an harmonic potential moving at constant 
speed. If the process is reversible, i.e. for velocities tending to zero, the work done in the dragging corresponds to the 
free-energy difference between the initial and the final states. In case of finite velocity, it is still possible to obtain an 
estimate of the free energy from the work distribution using Jarzynski Q or Crooks [42j relations. 

A general input for a steered MD calculation is the following: 

# a simple CV: a dihedral angle 
ANGLE LIST 13 15 17 

# switching on steered MD 



B. Umbrella Sampling 



V umb (s) = -KAPPA(.s - s ) 2 . 



(39) 



C. Thermodynamic integration and methods based on Jarzynski or Crooks relations 
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STEER CV 1 TO 3.0 VEL 0.5 KAPPA 500.0 

# instruction for CVs printout 
PRINT W_STRIDE 100 

# end of the input 
ENDMETA 

The keyword STEER activates the steering on the collective variable specified by CV. The target value is determined 
by the parameter keyword TO, the velocity, in unit of CV/kilostep, by VEL and the spring constant by KAPPA. The 
functional form of the dragging potential is the same as the one of formula |3"51 

The directive ANGLE selects the type of CV, in this case an angle defined by three atoms or group of atoms. 

The printout on C0LVAR file is analogous to umbrella sampling with the difference that in this method the restraint 

position changes during the run. 



D. Replica— exchange metadynamics 

When combined with GROMACS (both version 3.3 and 4), PLUMED can perform replica-exchan ge s imulations 
coupled with metadynamics in two different ways: parallel tempering metadynamics (PTMetaD) ill, 43| and bias- 
exchange metadynamics (BE-META) Q. 

PTMetaD is particularly useful to increase the diffusion of the system in conformational space. It consists in defining 
several replicas of the system, controlled by the same CVs but coupled with thermal baths at different temperatures 
T,;. As in standard parallel tempering [2lT |22||. pairs of replicas can exchange at a given time t two conformations Vi 
and Tj. The probability of such an exchange is given by: 

Pij = min(l,exp[(/3, - (3 % ){U{r ) - U{n)) + A(V<(*(r<),i) - Vi{s( rj ),t)) 

+ foiVMr^t) *))]), (40) 

where f3i = 1/KsTi is the inverse temperature, U(r) the internal potential, Vi and Vj the biasing potentials deposited 
by the two replicas. The effect of this algorithm is to sample the degrees of freedom perpendicular to the CVs more 
efficiently with respect to standard metadynamics. 

BE-META is designed to sample the system making use of a large number of CVs without the need of filling with 
Gaussians a high-dimensional space [H| . This is done employing several replicas of the system, controlled by a few 
different CVs for each replica. Usually one defines also a "neutral" replica, which evolves according to standard MD, 
i.e. without metadynamics. The temperature of the system is the same for all replicas. 

The exchange probability for a pair of replicas i and j is: 

Pij =mm(l,eMP[Vi(s(r i ),t) + V j (s(r j ),t)-V i (s(r j ),t)- Vj(a(ri),t)])). (41) 

To run PTMetaD simulations, one has to follow the standard GROMACS procedure for parallel tempering (see 
GROMACS manual). A binary topology file must be prepared one for each replica, while only one PLUMED input 
file is required. This file looks as follows: 

# switching on metadynamics and Gaussian parameters 
HILLS HEIGHT 0.1 W_STRIDE 500 

# switching on PTMetaD 
PTMETAD 

# instruction for CVs printout 
PRINT W_STRIDE 50 

# the CV: radius of gyration 
RGYR LIST <CA> SIGMA 0.1 
CA-> 

20 22 26 30 32 
CA<- 

# end of the input 
ENDMETA 

The keyword PTMETAD switches on parallel tempering plus metadynamics. All replicas have the same CVs, in this 
case the radius of gyration defined by the group of atoms <CA>. The Gaussian height set by the keyword HEIGHT is 
automatically rescaled with temperature, following Wi — Wq ^- , where i is the index of a replica and Ti its temperature. 
The plugin will produce one C0LVAR file and one HILLS file for each replica. 
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A similar procedure is used to run BE-META. A PLUMED input and a binary topology file must be provided, 
one for each replica. These files must end with the replica index (e.g., META_INP0, META_INP1, ...) and must contain 
all the CVs, in the same order, and the keyword BIASXMD. The first replica (META_INP0) must have the NOHILLS CV 
keyword for all the CVs; the other replicas must switch off the variables not used with a list of keywords NOHILLS CV. 
Also in this case, PLUMED will produce one COLVAR file and one HILLS file for each replica. 



PLUMED performs these basic functions: 

• Initialization and parsing of the input file; 

• Evaluation of the CVs value for a given microscopic configuration; 

• Calculation of the forces coming from the Gaussians deposited along the CVs trajectory - in the case of meta- 
dynamics - or from a fixed/moving restraint acting on the CVs - in case of umbrella sampling/steered MD; 

• Printout of CVs value on COLVAR file and, in the case of metadynamics, of the Gaussians deposited on HILLS 



The initialization of the plugin is done by the routine init_metadyn contained in metadyn. c. This routine is called by 
the main MD code, which communicates to the plugin some critical information such as the number of atoms, masses 
and charges, length of the simulation or timestep. The parsing of the PLUMED input is performed by the routine 
read_restraint in read_restraint . c , which reads the file and, according to the CV chosen (let's say CVname), 
calls a specific parsing routine contained in a file called restraint_CVname . c. 

The second task is controlled mainly by the routine restraint in restraint . c, which receives by the MD code the 
atoms positions at every step of dynamics. This routine calls a specific function, contained in restraint_CVname . c, 
which calculates the CVs value and the derivatives with respect to the coordinates. The same routine, depending on 
the free-energy method chosen, calls a proper function to calculate the force acting on the atoms and controls the 
printout of CVs on the COLVAR file. 

The forces calculated by the plugin are communicated back to the main MD code and added to the internal forces 
before the following integration step. The interaction of PLUMED with the principal code is summarized schematically 
in Fig. [U 



IV. OVERVIEW OF THE SOFTWARE STRUCTURE 



file. 



MD code 



PLUMED routines 




FIG. 1: Schematic representation of the interaction of PLUMED with the main MD code. 
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V. DESCRIPTION OF THE INDIVIDUAL SOFTWARE COMPONENTS 

The plugin package, distributed in a compressed tar archive, has the following directory structure: 

• common_f iles. The directory containing all the basic routines that compose PLUMED. 

• tests. A variety of examples of different CVs and free-energy methods provided with topology and input files 
for GROMACS, NAMD, AMBER (SANDER module) and DL.POLY. These examples, combined with a script 
adapted from CP2K |45j, work also as a regtest for the plugin. 

• patches. A collection of patches to interface PLUMED with different codes (see section IVTl for more details). 

• utilities. Two small utilities written in Fortran: sum_hills and driver. The former is a post-processing 
program which reads the HILLS file produced by the plugin in a metadynamics simulation and returns the free 
energy by summing the Gaussians that have been deposited. The latter is a tool that calculates the value of 
selected CVs along a MD trajectory. It requires a PDB file, a trajectory in DCD format and a file with the 
same syntax of the PLUMED input file. 

• doc. A complete manual with detailed installation instructions for each code. 



VI. INSTALLATION INSTRUCTIONS 

The installation of PLUMED on every supported program is done through an automatic patch procedure specific to 
each code. This is done on the clean code and requires its recompilation. All the patching procedures are illustrated 
in detail in the manual. Currently supported codes are NAMD 2.6, GROMACS 3.3.3 and 4.0.4, DLPOLY 2.16 and 
2.19, SANDER (AMBER 9 version). 



VII. TEST RUNS DESCRIPTION 

In the following we describe a few simple examples of the free-energy methods implemented in PLUMED applied 
to alanine dipeptide in vacuum at 300 K (see Fig. [5]). The AMBER99SB force field [46| has been used throughout all 
the simulations. These test runs have been conducted with either NAMD or AMBER (SANDER module) code. 




FIG. 2: Ball and stick representation of alanine dipeptide (Ace-Ala-Nme) in vacuum. The dihedral angle $ is defined by the 
set of atoms C-N-C a -C while the angle * by N-C a -C-N. 



A. Metadynamics 

Well-tempered metadynamics using the two dihedral angles <E> and * as CVs (see Fig. [5J has been performed with 
the SANDER code included in AMBER 9. The bias factor chosen is 10, the initial Gaussian height 0.1 kcal/mol, the 
width 0.35 rad for both CVs and the deposition stride 1 ps. The total simulation time is 5 ns. The free energy (see 
Fig. [3|) has been reconstructed from the Gaussians deposited during the run using sum_hills, the tool provided in 
the directory utilities. 
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-3 -2 EE 1 2 3 

FIG. 3: Free-energy of the alanine dipeptide as a function of the two dihedral angles <E> and obtained with well-tempered 
metadynamics. Isoenergy lines are drawn every 1 kcal/mol. 



B. Umbrella sampling 

Two-dimensional umbrella sampling on the dihedral angles $ and \P has been performed with NAMD code using 
676 umbrella simulations of 10 ps each and a spring constant of 100 kcalmol -1 rad~ 2 . Umbrellas have been chosen 
in an adaptive way. The WHAM code by Alan Grossfield [47j has been used to produce the free-energy profile (see 
Fig. H. 




FIG. 4: Free-energy of the alanine dipeptide as a function of the two dihedral angles $ and ^ obtained with umbrella sampling. 
Isoenergy lines are drawn every 1 kcal/mol. 
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C. One dimensional umbrella sampling and thermodynamic integration 

One dimensional umbrella sampling on dihedral \& has been performed with NAMD code using 26 windows and 
running a simulation of 20 ps per umbrella (520 ps total). The spring constant used is 100 kcal mol~ l rad~ 2 . Thermo- 
dynamic integration has been completed dragging the dihedral from 7r to — 7r in 504 ps. The resulting free energies 
are shown in Fig. [5l 



8n r 




FIG. 5: Free-energy of alanine dipeptide as a function of the dihedral angle obtained from a one dimensional umbrella 
sampling calculation (full line) and from thermodynamic integration (dashed line). 



VIII. AVAILABILITY 

The plugin can be downloaded from http://merlino.mi.infn.it/plumed. Any questions regarding the installa- 
tion and usage of PLUMED can be posted to the users mailing list at plumed-usersOgooglegroups . com. 

IX. CONCLUSIONS AND OUTLOOK 

In this paper we have presented PLUMED, a plugin aimed at performing the calculation of free energy landscapes 
using a number of state-of-the-art methods such as umbrella sampling, steered molecular dynamics and metadynamics. 
The unique feature of PLUMED is that it can be easily ported to four popular MD codes, namely AMBER, DL_POLY, 
GROMACS and NAMD. In the next future, we plan to further expand this list. The possibility of using PLUMED with 
different host codes will allow people to choose the proper code on the basis of its capabilities (e.g., implicit solvent, 
parallelism, particular force fields), and also taking into account its performance relative to a specific application. 
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